ABSTRACT

This article surveys statistical techniques which are nonparametric
in  nature  and  used  in  formal  ranking  and  selection  of  populations.  Such 
methods  have  been  developed  only  within  the  last  fifteen  years  and  are 
usually  based  on  rank  scores  and/or  robust  estimators  (such  as  the  Hodges- 
Lehmann  estimator).  The  procedures  surveyed  are  applicable  to  one-way 
classifications, two-way classifications, and paired-comparison models.
Computational  methods,  useful  inequalities,  and  appropriate  numerical  tables 
required  to  implement  these  techniques  are  identified  and  discussed. 
Asymptotic  relative  efficiencies  of  the  nonparametric  methods,  compared  to 
their  parametric  counterparts,  are  presented.  Specific  applications  of 
these  methods  (such  as  traffic  fatality  rates)  are  mentioned  and  areas 
for  further  theoretical  and  computational  research  are  identified. 

1.  Introduction  to  Selection  and  Ranking  Procedures 

A  common  problem  faced  by  an  experimenter  is  one  of  comparing  several 
categories or populations. These may be, for example, different varieties of
a grain, different competing manufacturing processes for an industrial product,
or different drugs (treatments) for a specific disease. In other words, we
have $k$ ($\ge 2$) populations and each population is characterized by the value
of a parameter of interest $\theta$, which may be, in the example of drugs, an
appropriate measure of the effectiveness of a drug. The classical approach to
this problem is to test the homogeneity (null) hypothesis
$H_0: \theta_1 = \dots = \theta_k$, where $\theta_1, \dots, \theta_k$ are the
values of the parameter for these populations. In the case of normal
populations with means $\theta_1, \dots, \theta_k$ and a common variance
$\sigma^2$, the test can be carried out using the F-ratio of the analysis of
variance.

The  above  classical  approach  is  inadequate  and  does  not  answer  a 
frequently  encountered  experimenter's  question,  namely,  how  to  identify  the 
best  category?  In  fact,  the  method  of  least  significant  differences  based 
on  t-tests  has  been  used  in  the  past  to  detect  differences  between  the  average 
yields  of  different  varieties  and  thereby  choose  the  'best'  variety.  But  this 
method  (and  others  related  to  it)  is  indirect  and  does  not  easily  provide 
an  overall  probability  of  a  correct  selection.  Also  the  multiple  comparison 
techniques  developed  largely  by  Tukey  and  Scheffe  arose  from  the  desire  to  draw 
inference  about  the  populations  when  the  homogeneity  hypothesis  is  rejected. 

Selection  and  Ranking  Procedures 

The  formulation  of  a  k-sample  problem  as  a  multiple  decision  problem 
enables  the  experimenter  to  answer  questions  regarding  the  best 
category.  The  formulation  of  multiple  decision  procedures  in  the  framework 
of  selection  and  ranking  procedures  has  been  accomplished  generally  using 
either  the  indifference  zone  approach  or  the  (random  sized)  subset  selection 
approach. The former approach was introduced by Bechhofer (1954). Substantial
contribution to the early and subsequent developments in the subset selection
theory  has  been  made  by  Gupta  starting  from  his  work  in  1956.  For  more 
details  about  the  numerous  contributions  and  the  related  bibliography, 
reference  should  be  made  to  a  recently  published  book  by  Gupta  and  Panchapakesan 
(1979).  This  monograph  discusses  both  approaches. 

A Brief Description of the Two Approaches

Bechhofer (1954) considered the problem of ranking $k$ normal means. In
order to explain the basic formulation, consider the problem of selecting
the population with the largest mean from $k$ normal populations with unknown
means $\mu_i$, $i = 1,\dots,k$, and a common known variance $\sigma^2$. Let
$\bar{x}_i$, $i = 1,\dots,k$, denote the means of independent samples of size
$n$ from these populations.

The 'natural' procedure (which can be shown to have optimum properties) will
be to select the population that yields the largest $\bar{x}_i$. The
experimenter would, of course, need a guarantee that this procedure will pick
the population with the largest $\mu_i$ with a probability not less than a
specified level $P^*$. For the problem to be meaningful $P^*$ lies between
$1/k$ and 1. Since we do not know the true configuration of the $\mu_i$, we
look for the least favorable configuration (LFC) for which the probability of
a correct selection, P(CS), will be at least $P^*$. Since the LFC is given by
$\mu_1 = \dots = \mu_k$, where P(CS) equals $1/k$, the probability guarantee
cannot be met whatever be the sample size $n$.

A natural modification is to insist on the minimum probability guarantee
whenever the best population is sufficiently superior to the next best. In
other words, the experimenter specifies a positive constant $\Delta^*$ and
requires that the P(CS) is at least $P^*$ whenever
$\mu_{[k]} - \mu_{[k-1]} \ge \Delta^*$, where
$\mu_{[1]} \le \dots \le \mu_{[k]}$ denote the ordered means. Now the
minimization of P(CS) is over the part $\Omega_{\Delta^*}$ of the parameter
space in which $\mu_{[k]} - \mu_{[k-1]} \ge \Delta^*$. The complement of
$\Omega_{\Delta^*}$ is called the indifference zone for the obvious reason.
The LFC in $\Omega_{\Delta^*}$ is given by
$\mu_{[1]} = \dots = \mu_{[k-1]} = \mu_{[k]} - \Delta^*$. The problem then
reduces to determining the minimum sample size required in order to have
P(CS) $\ge P^*$ for the LFC.
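
The reduction to a sample-size problem can be illustrated numerically. The sketch below is not taken from Bechhofer's paper; it assumes the standard expression for P(CS) of the natural rule under the LFC, namely $\int \Phi^{k-1}(x + \Delta^*\sqrt{n}/\sigma)\,\varphi(x)\,dx$, evaluates it by a simple midpoint rule, and searches for the smallest $n$. The function names `pcs_lfc` and `min_sample_size` are hypothetical.

```python
import math


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pcs_lfc(k, n, delta, sigma=1.0, steps=4000, lim=8.0):
    """P(CS) of the 'largest sample mean' rule at the LFC (midpoint rule on [-lim, lim])."""
    shift = delta * math.sqrt(n) / sigma
    width = 2.0 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * width
        total += std_normal_cdf(x + shift) ** (k - 1) * std_normal_pdf(x)
    return total * width


def min_sample_size(k, delta, p_star, sigma=1.0, n_max=100000):
    """Smallest common sample size n with P(CS) >= p_star at the LFC (linear search)."""
    for n in range(1, n_max + 1):
        if pcs_lfc(k, n, delta, sigma) >= p_star:
            return n
    raise ValueError("no sample size up to n_max meets the requirement")
```

For $k = 2$ the integral reduces to $\Phi(\Delta^*\sqrt{n}/(\sigma\sqrt{2}))$, which gives a convenient check on the quadrature.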

Bechhofer's  formulation  can  be  generalized  from  that  described  above. 

His  general  ranking  problem  includes,  for  example,  selection  of  the  t  best 
populations. 

In the subset selection approach, the goal is to select a non-empty
subset of the populations so as to include the best population. Here the
size of the selected subset is random and is determined by the observations
themselves. In the case of normal populations with unknown means
$\mu_1,\dots,\mu_k$ and a common variance $\sigma^2$, the rule proposed by
Gupta (1956) selects the population that yields $\bar{x}_i$ if and only if

$$\bar{x}_i \ge \max_{1 \le j \le k} \bar{x}_j - \frac{d\sigma}{\sqrt{n}},$$

where $d = d(k,P^*) > 0$ is determined so that the P(CS) is at least $P^*$.
Here a correct selection is selection of any subset that includes the
population with the largest $\mu_i$. Thus, the LFC is with regard to the whole
parameter space $\Omega$. Under this formulation, for given $k$ and $P^*$ we
determine $d$. The rule explicitly involves $n$. In general, the rule will
involve a constant which depends on $k$, $P^*$, and $n$.

The performance of a subset selection procedure is studied by evaluating the
expected subset size and its supremum over the parameter space $\Omega$.
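
As a small illustration of how the subset rule acts on data, the following sketch applies the rule for normal means to a list of sample means. The function name `gupta_subset` and the value of $d$ used in any call are illustrative assumptions; in practice $d$ would be read from tables for the given $k$ and $P^*$.

```python
import math


def gupta_subset(sample_means, d, sigma, n):
    """Indices i with xbar_i >= max_j xbar_j - d * sigma / sqrt(n)."""
    cutoff = max(sample_means) - d * sigma / math.sqrt(n)
    return [i for i, xbar in enumerate(sample_means) if xbar >= cutoff]
```

The selected subset always contains the population with the largest sample mean, so it is never empty, and it grows as $d$ increases or $n$ decreases.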


Nonparametric  Techniques  in  Multiple  Decision  Theory 

In  the  present  paper,  we  describe  selection  and  ranking  (ordering) 
procedures  which  are  nonparametric  or  distribution-free.  Such  procedures 
have  the  desirable  property  that  their  applicability  is  valid  under  relatively 
mild  assumptions  regarding  the  underlying  population(s)  from  which  the  data 
are  obtained.  Although  the  importance  of  nonparametric  methods  as  a 


significant  branch  of  modern  statistics  is  recognized  by  statisticians, 
modern  nonparametric  techniques  are  usually  restricted  to  hypothesis  testing, 
point  estimators,  confidence  intervals,  and  multiple  comparison  procedures. 
Other  recent  advances  in  nonparametric  tests  can  be  found  in  Hollander  and 
Wolfe  (1973)  and  Lehmann  (1975).  The  development  of  nonparametric  methods 
for  multiple  decision  procedures  is  important  in  statistical  research.  The 
present  paper  deals  with  selection  procedures  with  special  emphasis  on  the 
subset  selection  approach  related  to  the  largest  unknown  parameter. 

Analogous  procedures  (with  proper  modifications)  are  available  for  the 
selection  in  terms  of  the  smallest  parameter. 

In Section 2, we discuss procedures based on the ranks in the
combined sample. Section 3 deals with bounds on the probability of a
correct selection associated with the first two procedures $R_1(G)$ and
$R_2(G)$ of Section 2. In Section 4, the exact and asymptotic distribution
of the (appropriate) statistic based on rank sums is discussed. In
Section 5, we provide comparisons between $R_1$ and $R_3$ and certain
parametric procedures in terms of their asymptotic relative efficiencies.
Selection procedures based on pairwise ranks are discussed briefly in
Section 6.

Section 7 deals with selection procedures based on vector ranks. In
Section 8, procedures based on Hodges-Lehmann estimators are discussed.

2. Procedures Based on Combined Ranks.


Let $\pi_1,\dots,\pi_k$ be $k$ ($\ge 2$) independent populations. The
associated random variables $X_{ij}$, $j = 1,\dots,n_i$, $i = 1,\dots,k$, are
assumed independent and to have a continuous distribution $F_{\theta_i}(x)$,
where the $\theta_i$ belong to some interval $\Theta$ on the real line.
Suppose $F_\theta(x)$ is a stochastically increasing (SI) family of
distributions, i.e. if $\theta_1$ is less than $\theta_2$, then
$F_{\theta_1}(x)$ and $F_{\theta_2}(x)$ are distinct and
$F_{\theta_2}(x) \le F_{\theta_1}(x)$ for all $x$. Examples of such families
of distributions are: (1) any location parameter family, i.e.
$F_\theta(x) = F(x-\theta)$; (2) any scale parameter family, i.e.
$F_\theta(x) = F(x/\theta)$, $x > 0$, $\theta > 0$; (3) any family of
distribution functions whose densities possess the monotone likelihood ratio
(or $TP_2$) property. Let $R_{ij}$ denote the rank of the observation
$x_{ij}$ in the combined sample; i.e. if there are exactly $r$ observations
less than $x_{ij}$ then $R_{ij} = r+1$. These ranks are well-defined with
probability one, since the random variables are assumed to have a continuous
distribution. Let $Z(1) \le Z(2) \le \dots \le Z(N)$ denote an ordered sample
of size $N = \sum_{i=1}^{k} n_i$ from any continuous distribution $G$, such
that

$$-\infty < a(r) = E[Z(r)\,|\,G] < \infty \quad (r = 1,\dots,N).$$

With each of the random variables $X_{ij}$ associate the number $a(R_{ij})$
and define

$$H_i = n_i^{-1} \sum_{j=1}^{n_i} a(R_{ij}) \quad (i = 1,\dots,k). \tag{2.1}$$

Using the quantities $H_i$, Gupta and McDonald (1970) have defined procedures
for selecting a subset of the $k$ populations. Letting $\theta_{[i]}$ denote
the $i$th smallest unknown parameter, we have

$$F_{\theta_{[1]}}(x) \ge F_{\theta_{[2]}}(x) \ge \dots \ge F_{\theta_{[k]}}(x), \quad \forall x. \tag{2.2}$$

The population whose associated random variables have the distribution
$F_{\theta_{[k]}}(x)$ will be called the best population. In case several
populations possess the largest parameter value $\theta_{[k]}$, one of them
is tagged at random and called the best. A 'Correct Selection' (CS) is said
to occur if and only if the best population is included in the selected
subset. In the usual subset selection problem one wishes to select a subset
such that the probability is at least equal to a preassigned constant $P^*$
($1/k < P^* < 1$) that the selected subset includes the best population.
Mathematically, for a given selection rule $R$,

$$\inf_{\Omega} P(CS\,|\,R) \ge P^*, \tag{2.3}$$

where

$$\Omega = \{\theta = (\theta_1,\dots,\theta_k):\ \theta_i \in \Theta,\ i = 1,2,\dots,k\}. \tag{2.4}$$

The following three classes of selection procedures, which choose a
subset of the $k$ given populations, and which depend on the given
distribution $G$, have been considered:

$$R_1(G):\ \text{Select } \pi_i \text{ iff } H_i \ge \max_{1 \le j \le k} H_j - d \quad (i = 1,\dots,k;\ d \ge 0), \tag{2.5}$$

$$R_2(G):\ \text{Select } \pi_i \text{ iff } H_i \ge c^{-1} \max_{1 \le j \le k} H_j \quad (i = 1,\dots,k;\ c \ge 1), \tag{2.6}$$

$$R_3(G):\ \text{Select } \pi_i \text{ iff } H_i \ge D \quad (i = 1,\dots,k;\ -\infty < D < \infty). \tag{2.7}$$

It should be noted that rules $R_1(G)$, $R_2(G)$, and $R_3(G)$ are equivalent
if $k = 2$. The procedures $R_1(G)$ (and their randomized analogs) have been
suggested by Bartlett and Govindarajulu (1963) for continuous distributions
differing by a location parameter. The procedure $R_2(G)$ will be studied in
this paper only for the case where $H_i \ge 0$ for all $i$. The constants $d$
and $c$ are usually chosen to be as small as possible, $D$ as large as
possible, while satisfying the probability requirement (2.3). The number of
populations included in the selected subset is a random variable which takes
values 1 to $k$ inclusive for rules $R_1(G)$ and $R_2(G)$. The subset chosen
by rule $R_3(G)$, however, could possibly be empty. This aspect will be
addressed further at the end of Section 3.
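
The three rules are simple to apply once the score averages $H_i$ of (2.1) are available. The following sketch is only an illustration: it assumes no ties, uses the rank scores $a(r) = r$ by default, and the helper names (`combined_ranks`, `score_means`, `rule_r1`, `rule_r2`, `rule_r3`) are hypothetical.

```python
def combined_ranks(samples):
    """Rank every observation within the pooled sample (ties are assumed absent)."""
    pooled = sorted(x for sample in samples for x in sample)
    rank_of = {x: r for r, x in enumerate(pooled, start=1)}
    return [[rank_of[x] for x in sample] for sample in samples]


def score_means(samples, a=lambda r: r):
    """H_i of (2.1): the average score a(R_ij) within each sample."""
    return [sum(a(r) for r in ranks) / len(ranks) for ranks in combined_ranks(samples)]


def rule_r1(H, d):
    """R_1(G): select i iff H_i >= max_j H_j - d, with d >= 0."""
    top = max(H)
    return [i for i, h in enumerate(H) if h >= top - d]


def rule_r2(H, c):
    """R_2(G): select i iff H_i >= max_j H_j / c, with c >= 1 and all H_i >= 0."""
    top = max(H)
    return [i for i, h in enumerate(H) if h >= top / c]


def rule_r3(H, D):
    """R_3(G): select i iff H_i >= D; the selected subset may be empty."""
    return [i for i, h in enumerate(H) if h >= D]
```

With three samples of size two, for instance, `score_means` returns the three averaged ranks, and the last rule returns an empty list whenever $D$ exceeds every $H_i$.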

It has been shown by Gupta and McDonald that the infimum of
$P(CS\,|\,R_i(G))$, $i = 1,2,3$, over $\Omega$ is attained for $\theta$ in
the set $\{\theta:\ \theta_{[k-1]} = \theta_{[k]}\}$. This shows that for
$k = 2$ the infimum occurs at an equi-parameter configuration.

For $k \ge 3$ the least favorable configuration (LFC) is not given by the
equi-parameter configuration for $R_1(G)$ and $R_2(G)$ as can be seen from the
counterexamples of Rizvi and Woodworth (1970). The counterexample distribution
is a mixture of two distinct uniform random variables and is established for
$P^*$ near 1.

For the procedure $R_3(G)$ we can say more about the infimum of the
probability of a correct selection. The LFC is given by the equi-parameter
configuration and so

$$\inf_{\Omega} P(CS\,|\,R_3(G)) = \inf_{\Omega_0} P(CS\,|\,R_3(G)),$$

where $\Omega_0 = \{\theta \in \Omega:\ \theta_{[1]} = \dots = \theta_{[k]}\}$.

These selection rules are called distribution-free (or nonparametric) if
the constants required for implementation are computed from
$P(CS\,|\,R_i(G)) = P^*$ for $\theta \in \Omega_0$. In this case the
probability does not depend on the common parameter value and on the
underlying distribution functions. The probability computation is based on a
random assignment of rank scores.
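
For small $k$ and $n$ the random-assignment computation can be carried out by complete enumeration. The sketch below is an illustration only (the names `assignments` and `exact_pcs` are hypothetical); it uses rank scores $a(r) = r$, treats population 0 as the tagged best population, and evaluates the case $k = 3$, $n = 2$ with $d = 3$ and $D = 2$, the constants quoted for the Monte Carlo study in Section 5.

```python
from fractions import Fraction
from itertools import combinations


def assignments(ranks, k, n):
    """Yield every split of `ranks` into k labelled groups of size n."""
    if k == 1:
        yield (tuple(ranks),)
        return
    for first in combinations(ranks, n):
        rest = [r for r in ranks if r not in first]
        for tail in assignments(rest, k - 1, n):
            yield (first,) + tail


def exact_pcs(k, n, selects_first):
    """P(population 0 is selected) when all rank assignments are equally likely."""
    total = hits = 0
    for groups in assignments(list(range(1, k * n + 1)), k, n):
        T = [sum(g) for g in groups]      # rank sums T_i = n * H_i
        total += 1
        hits += bool(selects_first(T))
    return Fraction(hits, total)


# R_1 with d = 3 (so m = n*d = 6) and R_3 with D = 2 (so n*D = 4), for k = 3, n = 2.
p_r1 = exact_pcs(3, 2, lambda T: T[0] >= max(T) - 6)
p_r3 = exact_pcs(3, 2, lambda T: T[0] >= 4)
```

Both probabilities come out as 14/15, in agreement with the value reported for these constants in Section 5. The enumeration grows very quickly with $k$ and $n$, which is why tables and the asymptotic results of Section 4 are needed in practice.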


3. Bounds on $P(CS\,|\,R_i(G))$, $i = 1,2$.

Since the exact LFC for the selection rules $R_1(G)$ and $R_2(G)$ is unknown
for $k > 2$, it is useful to have bounds for the probabilities of correct
selection. We will assume $n_i = n$, $i = 1,\dots,k$, and write $H_{(j)}$ for
the statistic associated with the population having parameter
$\theta_{[j]}$. First consider rule $R_1(G)$. Since

$$(k-1)^{-1} \sum_{j=1}^{k-1} H_{(j)} \le \max_{1 \le j \le k-1} H_{(j)} \le n^{-1} \sum_{r=N-n+1}^{N} a(r) \tag{3.1}$$

and

$$\sum_{j=1}^{k} H_{(j)} = A/n,$$

where $A = \sum_{r=1}^{N} a(r)$, it follows that

$$\inf_{\Omega} P(H_{(k)} \ge v) \le \inf_{\Omega} P(CS\,|\,R_1(G)) \le \inf_{\Omega} P(H_{(k)} \ge u). \tag{3.2}$$

The quantities $u$ and $v$ are defined by

$$u = u(d,k,n) = [A - nd(k-1)]/nk \tag{3.3}$$

and

$$v = v(d,k,n) = n^{-1} \sum_{r=N-n+1}^{N} a(r) - d. \tag{3.4}$$

For the rule $R_2(G)$, we get a similar expression:

$$\inf_{\Omega} P(H_{(k)} \ge v') \le \inf_{\Omega} P(CS\,|\,R_2(G)) \le \inf_{\Omega} P(H_{(k)} \ge u'), \tag{3.5}$$

where

$$u' = u'(c,k,n) = n^{-1} A\,[1 + c(k-1)]^{-1} \tag{3.6}$$

and

$$v' = v'(c,k,n) = (nc)^{-1} \sum_{r=N-n+1}^{N} a(r). \tag{3.7}$$

The important point to note from the inequalities (3.2) and (3.5) is that
the infima over $\Omega$ of expressions of the form $P(H_{(k)} \ge K)$ are
attained when $\theta_{[1]} = \theta_{[k]}$.

For the particular case when $a(r) = r$, $nH_i = T_i$, the rank sum statistic
associated with $\pi_i$. Denoting $R_i(G)$ by $R_i$ in this case, the infimum
of $P(CS\,|\,R_1)$ can be related to the Mann-Whitney statistic. If $U$ is
the Mann-Whitney statistic associated with samples of size $n$ and $(k-1)n$
taken from identically distributed populations, then

$$\inf_{\Omega} P(CS\,|\,R_1) \ge P(U \le nd). \tag{3.8}$$

A similar expression can be derived for $R_2$. The Mann-Whitney U-statistic
has been tabulated by Milton (1964) among others.

Since $\sum_{j=1}^{k} H_{(j)} = A/n$, we see that

$$\max_{1 \le j \le k} H_j \ge A/(nk). \tag{3.9}$$

Hence, a sufficient, but not necessary, condition for the selection rule
$R_3(G)$ to select a nonempty subset is that $P^*$ be sufficiently large so
that

$$D \le A/N. \tag{3.10}$$

For large $n$, this sufficiency condition for rule $R_3(G)$ is satisfied if
$P^* > 1/2$. For rule $R_3$, i.e. when $a(r) = r$, the condition is
$D \le (N+1)/2$. As an example, with $k = 3$, $n = 5$ the sufficient condition
$D \le 8$ is satisfied for $P^* \ge 0.523$ and for such values a nonempty
subset will be selected.
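
The threshold in this example can be checked by direct enumeration of the null distribution of a rank sum. The sketch below is illustrative (the name `prob_rank_sum_at_least` is hypothetical); it computes $P(T_i \ge nD)$ at the boundary value $D = (N+1)/2$ for $k = 3$, $n = 5$ when all populations are identically distributed.

```python
from itertools import combinations


def prob_rank_sum_at_least(n, N, t):
    """P(T >= t), where T is the sum of n ranks drawn without replacement from 1..N."""
    hits = total = 0
    for subset in combinations(range(1, N + 1), n):
        total += 1
        hits += sum(subset) >= t
    return hits / total


k, n = 3, 5
N = k * n
D_bound = (N + 1) / 2                     # sufficient condition (3.10) with a(r) = r
p_threshold = prob_rank_sum_at_least(n, N, n * D_bound)
```

The resulting probability is slightly above one half, because the rank sum is symmetric about $n(N+1)/2$ and has positive mass at that point; this is the kind of value from which the quoted bound on $P^*$ arises.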

The evaluation of the constants $D = D(k,n,P^*)$ for the rule $R_3$ can be
effected as follows:

$$P^* \le P(T_i \ge Dn) = P\bigl(U \le n^2(k - \tfrac{1}{2}) - n(D - \tfrac{1}{2})\bigr), \tag{3.11}$$

where now we consider all populations identically distributed. Hence, $Dn$ is
the largest integer satisfying the inequality (3.11).

4. The Exact and Asymptotic Distribution of $\max_{1 \le j \le k} T_j - T_1$ for Identically Distributed Populations.

In this section the random variables $X_{ij}$, $j = 1,\dots,n_i$;
$i = 1,\dots,k$, are assumed independent identically distributed with a
continuous distribution $F(x)$. In this case the $H_i$ are exchangeable
random variables if $n_i = n$, $i = 1,\dots,k$. It should be noted that in a
slippage-type configuration, the constants required to implement rules
$R_i(G)$, $i = 1,2,3$, are determined from the basic probability requirement
$P(CS\,|\,R_i(G)) \ge P^*$ calculated with identically distributed
populations. In the case $a(R_{ij}) = R_{ij}$ the procedures $R_i(G)$ reduce
to the rank sum procedures $R_i$, $i = 1,2,3$. The distribution of the
statistic $\max_{1 \le j \le k} T_j - T_1$, both exact and asymptotic, is
somewhat easier to obtain than the corresponding distribution of the
statistic $\max_{1 \le j \le k} T_j / T_1$. For some results concerning the
latter statistic, see McDonald (1969). Our concern here will be the former
which is tantamount to considering rule $R_1$. Corresponding to rule $R_3$ is
the statistic $T_i$, the distribution of which has been well-treated
elsewhere in the Mann-Whitney format.

Gupta and McDonald (1970) have tabulated the quantity
$P(T_1 \ge \max_{1 \le j \le k} T_j - m)$ for $n = 2(1)5$ and
$m = 0,1,\dots,2n$ (which covers the entire distribution).

Asymptotically (as $n \to \infty$), they show

$$P[T_1 \ge \max_{1 \le j \le k} T_j - m] \approx \int_{-\infty}^{\infty} [\Phi(x + m/z)]^{k-1} \varphi(x)\,dx \quad (m \ge 0), \tag{4.1}$$

where $\Phi(\cdot)$ and $\varphi(\cdot)$ are the cumulative distribution
function and density of a standard normal random variable, respectively, and

$$z = z(n,k) = n[k(nk+1)/12]^{1/2}. \tag{4.2}$$

Integrals of the type

$$\int_{-\infty}^{\infty} [\Phi(x + h\sqrt{2})]^{k-1} \varphi(x)\,dx = P^* \tag{4.3}$$

have been considered in several publications. The $h$ quantity appearing in
this expression has been tabulated (to 3 dp) by Gupta (1963) in Table I for
$k = 2(1)51$ and $P^* = .75, .90, .95, .975$, and $.99$. Similar values are
provided (to 4 dp) in Table 1 of Gupta, Nagel and Panchapakesan (1973) for the
same $P^*$ and $k = 2(1)11(2)51$. Additional tabulation of $h$ is provided by
Milton (1963).

In Table IB of Milton's report, the $h$ quantity is tabulated (to 6 dp) for
$k = 3(1)10(5)25$ and $P^* = .3(.05).95, .975, .99, .995, .999, .9995$, and
$.9999$.

In Table II of the same publication $P^*$ values are given (to 8 dp) for
$h = 0(.05)5.15$ for all the previously mentioned values of $k$. Thus, this
asymptotic value can be obtained from a variety of sources and can be applied
directly to very large data sets - up to 51 populations and any (large)
sample size.

Matching the right hand side of (4.1) with (4.3) yields an asymptotic
approximation to $m = nd$ given by

$$m = hn[k(nk+1)/6]^{1/2}, \tag{4.4}$$

$h$ being the appropriate solution to (4.3) corresponding to the given $P^*$
and $k$. In the selection rule the smallest integer not less than $m$ should
be taken.
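
When the tables are not at hand, $h$ can be obtained numerically. The following sketch is an illustration under stated assumptions, not a replacement for the published tables: it evaluates the integral in (4.3) by a midpoint rule, solves for $h$ by bisection (the integral is increasing in $h$), and then applies (4.4) with rounding up. The function names are hypothetical.

```python
import math


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integral_43(h, k, steps=4000, lim=8.0):
    """Left-hand side of (4.3), by the midpoint rule on [-lim, lim]."""
    shift = h * math.sqrt(2.0)
    width = 2.0 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * width
        total += std_normal_cdf(x + shift) ** (k - 1) * std_normal_pdf(x)
    return total * width


def solve_h(k, p_star, lo=0.0, hi=10.0, tol=1e-7):
    """Bisection for the h that solves (4.3); requires 1/k < p_star < 1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integral_43(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_m(k, n, p_star):
    """Smallest integer not less than m of (4.4)."""
    h = solve_h(k, p_star)
    return math.ceil(h * n * math.sqrt(k * (n * k + 1) / 6.0))
```

For $k = 2$ the integral in (4.3) equals $\Phi(h)$, so `solve_h(2, p_star)` should agree with the ordinary normal quantile; this gives a simple check of the numerical scheme.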


Approximations to the Constant m for Use with $R_1$.

We saw that $\inf P(CS\,|\,R_1)$ over
$\Omega' = \{\theta:\ \theta_{[1]} = \dots = \theta_{[k-1]} \le \theta_{[k]}\}$
is attained when $\theta_{[1]} = \dots = \theta_{[k]}$. Suppose we want to
evaluate $d$ for which this infimum is at least $P^*$. Using the rank sum
statistics, this means that we want the smallest integer $m = nd$ such that

$$P(T_k \ge T_{[k]} - m) \ge P^*, \tag{4.5}$$

where $T_{[k]} = \max_j T_j$ and the $T_i$ are the rank sums of identically
distributed populations. McDonald (1971) has discussed two methods of
approximating the solution. The first method uses the asymptotic
($n \to \infty$) expression for the probability given by (4.1).

The second approximation is for large $P^*$ (near 1). Suppose
$Z_1,\dots,Z_k$ are $N(0,1)$ random variables with the correlation matrix
$\Gamma$, and let $Z_{[1]}$ denote the smallest of them. Let

$$P(Z_{[1]} \ge -\delta) = P^*. \tag{4.6}$$

Dudewicz (1969) has shown that, for large $P^*$ (near 1), an approximation to
$\delta$ is given by

$$\delta^2 \approx -2[\log(1-P^*)] \tag{4.7}$$

in the sense that the ratio tends to 1 as $P^* \to 1$. Using his
approximation and the joint asymptotic normal distribution of
$[n^2 k(nk+1)/6]^{-1/2}(T_k - T_i)$, $i = 1,\dots,k-1$, we obtain the
approximation

$$m^2 \approx -\left[\frac{n^2 k(nk+1)}{3}\right] \log(1-P^*). \tag{4.8}$$

One can also obtain this approximation from (4.1) by noting that
$mz^{-1} \approx \sqrt{2}\,\Phi^{-1}(P^*)$ as $P^* \to 1$, a result of Rizvi
and Woodworth (1970), and using the well-known fact that

$$\Phi^{-1}(P^*) \approx [-2\log(1-P^*)]^{1/2} \quad \text{as } P^* \to 1. \tag{4.9}$$

The two approximations have been compared by McDonald (1971) in the case of
$P^* = 0.99$ for $k = 2(1)5$, $n = 5(5)25$.

Let $m_1$ and $m_2$ denote the approximate values of $m$ given by (4.4) and
(4.8), respectively. The numerical evaluations of $m_1$ and $m_2$ show that
(a) $m_2 - m_1$ increases in $n$ for fixed $k$, and decreases in $k$ for fixed
$n$, (b) $m_1/m_2$ increases in $k$ for fixed $n$, and is constant for fixed
$k$ over various values of $n$, and (c) both approximations are conservative,
$m_2$ being more so than $m_1$.

For $k = 2$, McDonald (1971) has analytically shown that $m_2 - m_1$ is
positive and increasing in $n$, and that $m_1/m_2$ is independent of $n$.
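
The behaviour of the two approximations for $k = 2$ is easy to reproduce numerically. The sketch below is illustrative only; it uses the fact that for $k = 2$ the integral in (4.3) equals $\Phi(h)$, so that $h$ is an ordinary normal quantile, and it leaves both approximations unrounded. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def m1_two_populations(n, p_star):
    """Approximation (4.4) for k = 2, where h is the standard normal quantile of P*."""
    h = NormalDist().inv_cdf(p_star)
    return h * n * math.sqrt(2 * (2 * n + 1) / 6.0)


def m2(k, n, p_star):
    """Approximation (4.8): m^2 = -[n^2 k (nk + 1) / 3] log(1 - P*)."""
    return math.sqrt(-(n * n * k * (n * k + 1) / 3.0) * math.log(1.0 - p_star))


# Compare the two approximations over n = 5(5)25 at P* = 0.99.
rows = [(n, m1_two_populations(n, 0.99), m2(2, n, 0.99)) for n in range(5, 30, 5)]
```

In this case the ratio of the two values reduces to $\Phi^{-1}(P^*)/[-2\log(1-P^*)]^{1/2}$, which does not involve $n$, while the difference grows with $n$ through the common factor $n[k(nk+1)/6]^{1/2}$.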

5. Comparisons between $R_1$ and $R_3$ and with Parametric Procedures.

Recall that for $k = 2$ the rules $R_i(G)$, $i = 1, 2, 3$,
are equivalent. For the special case of rank sum statistics based on equal
sample sizes, Gupta and McDonald (1970) have studied the asymptotic efficiency
of $R_1$ relative to the means procedure of Gupta (1956) for normal
populations and the efficiency of $R_2$ relative to the procedure of Gupta
(1963) for gamma populations, both in the case of $k = 2$ populations.

Let $\pi_1$ and $\pi_2$ be independent normal populations with means
$\theta_0$ and $\theta_0 + \theta$ ($\theta \ge 0$) and common unit variance.
Let $R$ denote Gupta's means procedure. For both $R_1$ and $R$ satisfying the
$P^*$-condition, the asymptotic efficiency of $R_1$ relative to $R$ is
$\mathrm{ARE}(R_1,R;\theta) = \lim_{\epsilon \downarrow 0} n_R(\epsilon)/n_{R_1}(\epsilon)$,
where $n_R(\epsilon)$ and $n_{R_1}(\epsilon)$ are the sample sizes for which
$E(S) - P(CS) = \epsilon$ for $R$ and $R_1$, respectively. It is shown by
Gupta and McDonald that

$\mathrm{ARE}(R_1,R;\theta) =$ (5.1)

where

$$B^2(\theta) = \int_{-\infty}^{\infty} \Phi^2(x+\theta)\,\varphi(x)\,dx - \Phi^2(\theta/\sqrt{2}).$$

As $\theta$ decreases to zero, we see that
$\mathrm{ARE}(R_1,R;\theta) \to 3/\pi = 0.9549$.

Some exact calculations for the probabilities of choosing $\pi_1$ and
$\pi_2$ using rank sum procedures can be made using Table C-1 of Milton (1970)
for $\theta = .2(.2)1.0$, 1.5, 2.0, and 3.0. This table tabulates the
distribution of the Wilcoxon two-sample statistic under the normal shift
alternative specified by $\theta$. As an example, for $k = 2$, $n = 6$, and
$P^* = .910177$, the rank sum selection rules take the form: select $\pi_i$
iff $T_i \ge 31$, $i = 1,2$. If the underlying distributions are normal with
means 0 and $\theta = .2$ with unit variances, then by summing the appropriate
rows in Table C-1 we find $P(T_1 \ge 31)$ = P(Choosing $\pi_1$) = .8465 and
$P(T_2 \ge 31)$ = P(Choosing $\pi_2$) = .9518.

Let $R'$ denote Gupta's procedure for gamma populations. Let the scale
parameters of $\pi_1$ and $\pi_2$ be $\theta_0$ and $\theta_0\theta$,
$\theta > 1$. In this case

$\mathrm{ARE}(R_2,R';\theta) =$ (5.2)

where now

$$B^2(\theta) = 1 - 2(1+\theta)^{-1} + (2\theta+1)^{-1} + \theta(2+\theta)^{-1} - 2\theta^2(1+\theta)^{-2}.$$

As $\theta$ decreases to 1, we have $\mathrm{ARE}(R_2,R';\theta) \to 3/4$.

In another paper, Gupta and McDonald (1972) have compared the procedures
$R_1$, $R_2$, and $R_3$ based on rank sum statistics with a procedure $R_m$
which they proposed for selection from gamma populations in terms of the
guaranteed life. Let $\pi_i$ have the associated density function

$$f(x-\theta_i) = \begin{cases} [\lambda\Gamma(r)]^{-1}\,[(x-\theta_i)/\lambda]^{r-1}\,e^{-(x-\theta_i)/\lambda}, & x \ge \theta_i, \\ 0, & \text{elsewhere,} \end{cases}$$

where $r$ ($> 0$) and $\lambda$ ($> 0$) are known common parameters. In
life-testing problems, the parameter $\theta$ is called the guaranteed life
time. We can assume with no loss of generality that $\lambda = 1$. Let $Y_i$
be the smallest order statistic based on $n$ independent observations from
$\pi_i$. It is known that $Y_i$ is a complete and sufficient statistic for
$\theta_i$. The procedure of Gupta and McDonald for selecting a subset
containing the population with the largest guaranteed life is

$$R_m:\ \text{Select } \pi_i \text{ iff } Y_i \ge Y_{[k]} - b, \tag{5.3}$$

where $b = b(k,n,P^*) \ge 0$ is chosen to satisfy the $P^*$-requirement. They
have shown that

$$\inf_{\Omega} P(CS\,|\,R_m) = \int_{0}^{\infty} H^{k-1}(x+b)\,dH(x), \tag{5.4}$$

where $H(x)$ is the cdf of $Y_i$ when $\theta_i = 0$.

In the special case of $r = 1$, the exponential case, (5.4) reduces to

$$\inf_{\Omega} P(CS\,|\,R_m) = k^{-1}(1-w)^{-1}(1-w^k), \tag{5.5}$$

where $w = 1 - e^{-nb}$. For this special case, Gupta and McDonald (1972) have
tabulated the values of $b$ for $k = 5, 10$; $n = 2(1)25$; and
$P^* = 0.75, 0.90, 0.95, 0.975, 0.99$.
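
For values of $k$, $n$, and $P^*$ outside the tabulated range, $b$ can be found directly from (5.5), since the right-hand side is increasing in $b$. The sketch below is illustrative and the function names are hypothetical; it writes $c = 1 - w = e^{-nb}$ and solves by bisection.

```python
import math


def inf_pcs_exponential(b, k, n):
    """Right-hand side of (5.5) with w = 1 - exp(-n*b), written in terms of c = 1 - w."""
    c = math.exp(-n * b)
    if c == 0.0:                 # exp underflow: the probability is 1 to machine precision
        return 1.0
    return (1.0 - (1.0 - c) ** k) / (k * c)


def solve_b(k, n, p_star, tol=1e-10):
    """Smallest b (up to tol) with inf P(CS | R_m) >= p_star; requires 1/k < p_star < 1."""
    lo, hi = 0.0, 1.0
    while inf_pcs_exponential(hi, k, n) < p_star:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if inf_pcs_exponential(mid, k, n) < p_star:
            lo = mid
        else:
            hi = mid
    return hi
```

For $k = 2$ the expression (5.5) equals $(1+w)/2$, so that $b = -\log[2(1-P^*)]/n$ in closed form, which can be used to check the bisection.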

Consider three exponential populations with location parameters
$\theta_1 = 0$, $\theta_2 = \theta_3 = \theta \ge 0$. In this case, Gupta and
McDonald (1972) have compared the expected subset sizes for the procedures
$R_1$, $R_2$, $R_3$, and $R_m$ for $\theta = 0(0.1)1.5$ and
$P^* = 0.6, 14/15$. The computations indicate that (1) $R_1$ and $R_2$ perform
equally  well  for  P*  =  0.6,  (2)  and  R^  perform  equally  well  for  P*  =  14/15, 

(3)  E ( S | Rg)  *  E ( S | R^ )  <  E(S|R^)  for  all  e,  equality  holding  for  e  =  0,  (4)  Rm 
performs better than all the distribution-free procedures for the smaller value
of $P^*$, (5) for the larger $P^*$, the distribution-free procedures are better
than $R_m$ for $\theta \le 0.5$, and (6) for larger values of $\theta$
($\theta > 1.1$) $R_m$ is the best among the four rules.

Ofosu (1974) has studied the procedure $R_m$ and compared its performance
with a procedure that excludes from the selected subset those populations for
which $Y_i$ is sufficiently below $\bar{Y}$, the average of the $Y_i$. Based
on a comparison of the expected subset sizes, Ofosu concludes that $R_m$ is
superior to the rules based on averaged $Y_i$ in almost all situations. For
those rare situations where $R_m$ is not superior, it is only slightly
inferior.

Gupta and McDonald (1970) compare the performance of selection rules $R_1$
and $R_3$ in some Monte Carlo studies. Normal and logistic distributions with
variance unity were studied for different configurations of their means. For
$k = 3$ and $n = 2,3,4$, these configurations were taken to be (0.1,0,0),
(0.2,0,0), (0.5,0,0), (1.0,0,0), (2.0,0,0), (0.1,0.1,0), (0.2,0.2,0),
(0.5,0.5,0), (1.0,1.0,0), (2.0,2.0,0). The number of simulations was 500 or
1000. The logistic distribution was chosen because equally spaced scores such
as ranks yield locally most powerful tests for the location parameter of this
distribution. The constants $d$ and $D$ were chosen to yield approximately the
same $P^*$ in the case of identical distributions. Then the ratio of
$kP(CS\,|\,R)$ and $E(S\,|\,R)$ was computed for both rules $R_1$ and $R_3$.
The bigger ratio for a rule indicates it to be better than the other. For
example, for $k = 3$, $n = 2$, $D = 2$ and $d = 3$ give the probability 14/15
for the identical case. Using the configuration (0.1,0,0) for the normal
means, the two ratios are 1.012 for $R_1$ and 1.005 for $R_3$ so that $R_1$
seems slightly better than $R_3$. Using the configuration (0.5,0,0), $R_3$
was slightly better than $R_1$; the ratios being 1.045 for $R_1$ and 1.049 for
$R_3$.
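
A Monte Carlo comparison of this kind can be sketched as follows. This is not the authors' simulation code: it is a minimal illustration for normal populations with unit variance and rank scores $a(r) = r$, and the function name `simulate_ratios`, the seed, and the number of replications are arbitrary assumptions.

```python
import random


def simulate_ratios(means, n, d, D, reps=20000, seed=1):
    """Estimate k*P(CS|R)/E(S|R) for the rank sum rules R_1 and R_3."""
    rng = random.Random(seed)
    k = len(means)
    best = max(range(k), key=lambda i: means[i])   # first population with the largest mean
    correct = {"R1": 0, "R3": 0}
    size = {"R1": 0, "R3": 0}
    for _ in range(reps):
        obs = sorted((rng.gauss(mu, 1.0), i) for i, mu in enumerate(means) for _ in range(n))
        T = [0] * k
        for rank, (_, i) in enumerate(obs, start=1):
            T[i] += rank                           # rank sum of population i
        subsets = {
            "R1": [i for i in range(k) if T[i] >= max(T) - n * d],
            "R3": [i for i in range(k) if T[i] >= n * D],
        }
        for rule, subset in subsets.items():
            correct[rule] += best in subset
            size[rule] += len(subset)
    return {rule: k * correct[rule] / size[rule] for rule in correct}


ratios = simulate_ratios((0.1, 0.0, 0.0), n=2, d=3, D=2)
```

With identical populations both ratios have expectation one, so departures from one under a given configuration indicate how much each rule gains from the separation of the means. The estimates are subject to simulation error and need not match the published figures digit for digit.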

These  Monte  Carlo  studies  showed  no  significant  uniform  superiority  of 
either  of  these  procedures.  However,  seemed  to  perform  slightly  better 
than  R^  in  the  cases  where  the  two  highest  parameters  are  equal.  No  difference 
in the performance of $R_1$ and $R_3$ was noticeable when the distribution
changed from logistic to normal. In all cases the frequency of correct
selections for $R_1$ was higher than the theoretical value exactly calculated
for the identical
distributions.  Thus,  there  was  no  indication  that  the  infimum  of  the  probability 
of  a  correct  selection  does  not  take  place  when  all  populations  are  identically 
distributed  as  normal  or  logistic  distributions  under  shift  in  location. 


6. Selection Procedures Based on Pairwise Ranks

As noted earlier the least favorable configuration over $\Omega$ for the
selection rule $R_1(G)$ is not known and a counterexample exists showing that
the infimum of the probability of a correct selection does not occur when all
populations are identically distributed. (Rules of the form $R_3(G)$ do not
share this difficulty.) Hsu (1980) overcomes this difficulty by constructing a
rule based on pairwise rather than joint ranking of the samples.


Let $R^{(i)}_{jp}$ denote the rank of $X_{jp}$ within $X_{i1},\ldots,X_{in},X_{j1},\ldots,X_{jn}$, and
let $R^{(i)}_j = \sum_{p=1}^{n} R^{(i)}_{jp}$ be the rank sum statistic for $\pi_j$ compared to
$\pi_i$. Let $\{D^{(ij)}_{(p)},\ p = 1,\ldots,n^2\}$ denote the collection of $n^2$ ordered differences
$X_{iu} - X_{jv}$, $u, v = 1,\ldots,n$, and set $D^{(ij)}_{med} = \mathrm{median}_p\, D^{(ij)}_{(p)}$; i.e.,
$D^{(ij)}_{med}$ is the usual Hodges-Lehmann (H-L) estimator of $\theta_i - \theta_j$. For
$i = 1,\ldots,k$, let

$$M_i = \frac{1}{k}\sum_{j=1}^{k} D^{(ij)}_{med}, \quad \text{where } D^{(ii)}_{med} = 0.$$

The procedure proposed by Hsu for selection of the population with
the largest $\theta_i$, denoted by $R_R$, is as follows:

$R_R$: Select $\pi_i$ iff

$$M_i = \max_{1 \le j \le k} M_j \quad \text{or} \quad \max_{j \ne i} R^{(i)}_j < r_n(P^*), \qquad (6.1)$$

where $r_n(P^*)$ is the smallest integer such that
$P_0[\max_{j \ne i} R^{(i)}_j < r_n(P^*)] \ge P^*$ and
$P_0(\cdot)$ indicates the probability is computed assuming all populations are
identically distributed.
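The rule (6.1) can be sketched in a few lines of Python. This is only an illustration, not Hsu's own implementation: the function names, the toy data, and the critical value `r_crit` (standing in for $r_n(P^*)$, which in practice would be read from published tables) are assumptions made for the example, and the data are assumed to contain no ties.

```python
from statistics import median

def rank_sum(x_i, x_j):
    """Rank sum of sample j within the combined sample (x_i, x_j); assumes no ties."""
    n = len(x_j)
    # Each pair in which a value of sample j exceeds a value of sample i
    # raises the rank sum of sample j by one above its minimum n(n+1)/2.
    exceed = sum(1 for a in x_i for b in x_j if b > a)
    return n * (n + 1) // 2 + exceed

def hl_difference(x_i, x_j):
    """Median of all pairwise differences: two-sample H-L estimate of theta_i - theta_j."""
    return median(a - b for a in x_i for b in x_j)

def select_rr(samples, r_crit):
    """Return (selected indices, M values) for the pairwise-rank rule (6.1)."""
    k = len(samples)
    m = [
        sum(0.0 if j == i else hl_difference(samples[i], samples[j]) for j in range(k)) / k
        for i in range(k)
    ]
    best = max(m)
    selected = []
    for i in range(k):
        largest_rank_sum = max(rank_sum(samples[i], samples[j]) for j in range(k) if j != i)
        if m[i] == best or largest_rank_sum < r_crit:
            selected.append(i)
    return selected, m

# Toy data: three samples of size three; r_crit = 8 is a made-up critical value.
samples = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [10.0, 11.0, 12.0]]
selected, m = select_rr(samples, r_crit=8)
print(selected)  # [2]
```

With the made-up value `r_crit=8` only the third sample is retained; a larger critical value retains more populations, as expected for a subset selection rule.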

The procedure $R_R$ does not depend on the pairwise ranks alone. However,
the contribution from the "$M_i = \max_{1 \le j \le k} M_j$" portion is small when n is large,
and is included to insure that a nonempty subset is selected. The constants
$r_n(P^*)$ can be obtained from Steel (1959) for P* = .95, .99, k = 3(1)10,
n = 4(1)20; from Miller (1966, Table VIII) for P* = .95, .99, k = 3(1)11,
n = 6(1)20(5)50, 100.
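Outside the tabulated ranges, the critical value could in principle be approximated by simulation. The sketch below is an assumption-laden illustration, not a method described in the source: it draws identically distributed uniform samples (the null distribution of ranks does not depend on the common continuous distribution), records the largest pairwise rank sum against a reference sample, and returns the smallest integer r whose estimated probability P[max < r] reaches P*. The function name `estimate_r`, the replication count, and the seed are hypothetical choices.

```python
import math
import random

def estimate_r(k, n, p_star, reps=2000, seed=0):
    """Monte Carlo estimate of the smallest integer r with P0[max rank sum < r] >= p_star."""
    rng = random.Random(seed)
    base = n * (n + 1) // 2  # smallest possible rank sum of a sample of size n
    maxima = []
    for _ in range(reps):
        samples = [[rng.random() for _ in range(n)] for _ in range(k)]
        ref = samples[0]  # by symmetry any population can serve as the reference
        maxima.append(
            max(base + sum(1 for a in ref for b in s if b > a) for s in samples[1:])
        )
    maxima.sort()
    need = math.ceil(p_star * reps)  # number of replications that must fall below r
    return maxima[need - 1] + 1

r_hat = estimate_r(k=3, n=4, p_star=0.95)
print(r_hat)
```

The estimate carries Monte Carlo error, so published tables should be preferred whenever they cover the required k, n, and P*.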


Hsu  also  investigates  the  Pitman  efficiency  of  the  RR  procedure  compared 
to  a  means  procedure  (with  common  unknown  variance)  and  shows  it  to  be  the 
same  as  the  Pitman  efficiency  of  the  Mann-Whitney  test  relative  to  the  usual 
t-test. 

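Letting $d^{(ij)}_{(1)} \le d^{(ij)}_{(2)} \le \cdots \le d^{(ij)}_{(n^2)}$ denote the $n^2$ ordered values of $D^{(ij)}_{(p)}$,
$m = r_n(P^*) - n(n+1)/2$, and $\bar d^{(i)}_{(m)} = k^{-1}\sum_{j \ne i} d^{(ij)}_{(m)}$, an alternative procedure was
also suggested by Hsu and is given by:

$R_r$: Select $\pi_i$ iff

$$\min_{j \ne i}\,(\bar d^{(i)}_{(m)} - M_j) > 0 \quad \text{or} \quad \max_{j \ne i} R^{(i)}_j < r_n(P^*). \qquad (6.2)$$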

The  subset  selected  by  RR  always  contains  the  subset  selected  by  RR;  however, 
the  two  rules  are  equi-efficient  in  terms  of  their  Pitman  efficiencies  under 
the location model.


7.  Selection  Procedures  Based  on  Vector  Ranks

In the preceding procedures $R_1$, $R_2$, and $R_3$, the statistics $H_i$ are defined
using the ranks of the observations in the pooled sample. In cases with
equal sample sizes, vector-at-a-time sampling can be used effectively to
remove block effects, such as in a two-way layout, and to reduce data storage
requirements. These procedures cover, for example, models of the form

$$X_{ij} = \mu + \theta_i + \beta_j + \varepsilon_{ij}, \quad i = 1,\ldots,k,\ j = 1,\ldots,n, \qquad (7.1)$$

where $\theta$ refers to a population effect, $\beta$ to a block effect, and $\varepsilon$ to an error
term with any (not necessarily normal) continuous distribution.

Let $(X_{1j}, X_{2j}, \ldots, X_{kj})$ be the jth vector and $R_{ij}$ be the rank of $X_{ij}$ among
the k observations of the vector. Let $Z(1) \le Z(2) \le \cdots \le Z(k)$ denote an ordered
sample of size k from a continuous distribution G. Define $b(r)$ as in Section 2, i.e.,

$$b(r) = E[Z(r) \mid G], \quad r = 1,\ldots,k,$$

and set

$$H_i' = \frac{1}{n}\sum_{j=1}^{n} b(R_{ij}), \quad i = 1,\ldots,k. \qquad (7.2)$$


McDonald (1972) investigated the classes of procedures $R_1'(h;G)$ and
$R_2'(g;G)$ which are defined using the two classes of functions $\{h(x)\}$ and
$\{g(x)\}$, where h and g are nondecreasing real-valued functions defined on
the interval $I = [b(1), b(k)]$ and h satisfies the additional property that
$h(x) \ge x$ for all $x \in I$. The two classes of procedures are

$R_1'(h;G)$: Select $\pi_i$ iff

$$h(H_i') \ge \max_{1 \le j \le k} H_j', \qquad (7.3)$$

and

$R_2'(g;G)$: Select $\pi_i$ iff

$$g(H_i') \ge 0. \qquad (7.4)$$




Particular members of these classes that are of special interest are
$R_1'(G)$ with $h(x) = x+b$, $b \ge 0$, and $R_2'(G)$ with $g(x) = x-d$, d real. Of course,
$R_2'(g;G)$ can select an empty set; however, the rule $R_2'(G)$ will necessarily
choose a nonvoid subset if $P^* > .5$ and n is large. The treatment of $R_1'(G)$
and $R_2'(G)$ parallels that of $R_1(G)$ and $R_2(G)$ described earlier. The infimum
of  P(CS)  over  fl  is  attained  at  a  point  in  ^  in  the  case  of  R-j(h;G).  However 
as in the case of $R_1(G)$, it is not generally true that the infimum is attained
at an equi-parameter configuration. But the statement is true in the case of
$R_2'(g;G)$.

When $b(r) = r$, $nH_i' = T_i$, the rank sum statistic associated with $\pi_i$.
McDonald (1973) has discussed the related distribution of $U = \max_{1 \le j \le k} T_j - T_i$,
where the distributions $F_i$ are identical. He has tabulated $P(U \le b)$ for k = 2,
n = 2(1)20; k = 3, n = 2(1)8; k = 4, n = 2(1)5; and k = 5, n = 2,3. For
$P(U \le b) = P^*$ = 0.75, 0.90, 0.95, 0.975, and 0.99, he has tabulated the
asymptotic value of b for k = 2, n = 10(5)20; k = 3, n = 6(1)8; k = 4, n =
3(1)5; and k = 5, n = 3.
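For the special case b(r) = r, the vector-rank statistics reduce to within-block rank sums, which are easy to compute directly. The following Python sketch illustrates this under stated assumptions: observations within each vector are untied, the function names are hypothetical, and the constants passed to the two rules are made-up values expressed on the rank-sum scale (a real application would take them from McDonald's tables).

```python
def vector_rank_sums(blocks):
    """blocks[j] holds the j-th vector (X_1j, ..., X_kj); returns T_i = sum over j of R_ij."""
    k = len(blocks[0])
    totals = [0] * k
    for vec in blocks:
        # Rank the k observations within this vector only (1 = smallest).
        order = sorted(range(k), key=lambda i: vec[i])
        for rank, i in enumerate(order, start=1):
            totals[i] += rank
    return totals

def select_r1_prime(totals, b):
    """Rule of type (7.3) with h(x) = x + b: keep i when T_i >= max T_j - b."""
    top = max(totals)
    return [i for i, t in enumerate(totals) if t >= top - b]

def select_r2_prime(totals, d):
    """Rule of type (7.4) with g(x) = x - d: keep i when T_i >= d."""
    return [i for i, t in enumerate(totals) if t >= d]

# Four blocks (rows) observed on three populations (columns); values are invented.
blocks = [
    [3.1, 2.0, 5.4],
    [2.9, 3.5, 6.0],
    [1.0, 0.5, 2.2],
    [4.0, 4.4, 4.2],
]
T = vector_rank_sums(blocks)
print(T)                      # [6, 7, 11]
print(select_r1_prime(T, 3))  # [2]
print(select_r2_prime(T, 7))  # [1, 2]
```

Because ranking is done within each vector, adding a constant to every entry of a block leaves the rank sums unchanged, which is how block effects are removed.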

The investigations of McDonald (1973) with respect to slippage configurations
based on simulations show that $R_1'$ and $R_2'$ (which are $R_1'(G)$ and $R_2'(G)$, respectively,
in the special case with $b(r) = r$) are roughly equivalent when the underlying
distribution  has  a  long  tail  and  the  slippage  is  small,  and  that  is  better 

otherwise.  These  rules  have  been  used  by  McDonald  (1979)  in  an  analysis  of  state 
traffic  fatality  rates  recorded  by  year. 

Lorenzen and McDonald (1980) further investigate the probability of a
correct selection using rule $R_1'$ by Monte Carlo simulations covering a wide
range of distributions and parameter configurations (both location and
scale). In all cases investigated the LFC, i.e., the configuration minimizing
P(CS), appeared to be the equi-parameter configuration. This suggests that
the practical inference corresponding to the selection procedure need not be
restricted to the slippage configurations.
In another paper, McDonald (1975) considered the case of three exponential

distributions  with  parameters  (guaranteed  lives)  0  =  u.|  <_  =  0  with 

samples of size two. For the rules $R_1$, $R_2$, and $R_3$ (using rank sum statistics
$T_i$) the infimum of P(CS) takes place when $\theta_1 = \theta_2 = \theta_3$. However, it is shown
that the expected subset size is not bounded above by $kP^*$, a property enjoyed
by many parametric procedures [see Gupta (1965)] for the location parameter
case under monotone likelihood ratio conditions.

Within the context of a block design (2-way classification) Lee (1980)
considers another type of selection rule based on the statistics
$Y_i = \sum_{j=1}^{n} Y_{ij}$, $i = 1,\ldots,k$, where

$$Y_{ij} = \begin{cases} 1 & \text{if } X_{ij} = \max_{1 \le l \le k} X_{lj}, \\ 0 & \text{otherwise.} \end{cases} \qquad (7.5)$$

The selection rule is stated as

$R_{MS}$: Select $\pi_i$ iff

$$Y_i \ge \max_{1 \le j \le k} Y_j - d_{MS}, \qquad (7.6)$$

where $d_{MS}$ is the smallest nonnegative integer required to insure the probability
of  a  correct  selection  is  no  less  than  a  prescribed  P*.  The  procedure  is  a 
multinomial  selection  rule  (hence  the  subscript  MS)  designed  to  choose  a 
subset  to  contain  the  population  having  the  highest  probability  of  yielding 
the  largest  observation.  An  analogous  rule  for  choosing  a  subset  to  contain 
the  population  having  the  highest  probability  of  yielding  the  smallest 
observation  is  also  defined. 
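A small Python sketch may make the multinomial rule concrete. It assumes, consistent with the description above, that each block contributes one count to the population yielding its largest observation; the function names, the toy data, and the value of the constant `d_ms` are hypothetical, and ties within a block are assumed not to occur.

```python
def largest_counts(blocks):
    """Count, for each population, the number of blocks in which it gives the largest value."""
    k = len(blocks[0])
    counts = [0] * k
    for vec in blocks:
        winner = max(range(k), key=lambda i: vec[i])
        counts[winner] += 1
    return counts

def select_ms(counts, d_ms):
    """Keep population i when its count is within d_ms of the largest count."""
    top = max(counts)
    return [i for i, y in enumerate(counts) if y >= top - d_ms]

# Four blocks (rows) and three populations (columns); values are invented.
blocks = [
    [3.1, 2.0, 5.4],
    [2.9, 3.5, 6.0],
    [1.0, 0.5, 2.2],
    [4.0, 4.4, 4.2],
]
Y = largest_counts(blocks)
print(Y)                # [0, 1, 3]
print(select_ms(Y, 1))  # [2]
print(select_ms(Y, 2))  # [1, 2]
```

The analogous rule for the smallest observation would replace `max` by `min` in `largest_counts`; the selection step is unchanged.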




The constants required to implement the procedure $R_{MS}$ have been
determined  by  Lee  (1980)  using  Monte  Carlo  simulation,  assuming  the  underlying 
distributions  are  identical,  for  k  =  49  and  n  =  17.  These  values  were  then 
used  to  select  subsets  of  states  on  the  basis  of  traffic  fatality  rates 
recorded  over  a  period  of  17  years.  Gupta  and  Nagel  (1967)  investigated 
the  least  favorable  configuration  in  a  corresponding  multinomial  formulation 
and  concluded,  based  on  some  numerical  case  studies,  that  the  identically 
distributed  case  appears  least  favorable.  Panchapakesan  (1971)  proved  that 
the  identically  distributed  configuration  is  asymptotically  least  favorable. 


8.  Selection  Procedures  Based  on  Hodges-Lehmann  Estimators

Let $X_{ij}$ (j = 1,...,n; i = 1,2,...,k), $k \ge 2$, be independent random
observations from k populations with continuous cdf's $F(x-\theta_i)$, i = 1,2,...,k,
with common variance $\sigma^2 = 1$. The following problems have been considered by
Bechhofer (1954) under the normality assumption:

(i) select a "good" population, the ith population being regarded as
good if $\theta_i \ge \theta_{[k]} - \Delta$, for some preassigned $\Delta > 0$ (i = 1,2,...,k-1);

(ii) select the best t populations, i.e., the populations with location
parameters $\theta_{[k-t+1]},\ldots,\theta_{[k]}$, without regard to order;

(iii) select the best t populations with regard to order.

His approach, now known as the "indifference zone" approach, selects the
"best" populations with a guaranteed minimum probability P* (preassigned) of
correct selection when $(\theta_1,\ldots,\theta_k)$ lies in a subset, say $\Omega^*$, of the parameter
space. The region $\Omega^*$ is called the preference zone and $R^k - \Omega^*$ is the
indifference zone. Some of the procedures discussed earlier use rank
statistics for selection purposes.


However, when formulated for the problems discussed in this section, the slippage
configuration of parameters defined by the indifference zone is not necessarily the LFC.
The slippage configuration, as pointed out by Puri and Puri (1969), is least
favorable when the parameters satisfy the relation $\theta_i - \theta_j = O(n^{-1/2})$
for all $1 \le i, j \le k$, $i \ne j$.

Ghosh  (1973)  has  proposed  alternate  procedures,  based  on  one-sample 
Hodges-Lehmann estimators of the $\theta_i$'s under the additional assumption that
F  is  symmetric  about  the  origin.  Ghosh's  procedures  give  in  all  these  cases 
least  favorable  configurations  for  finite  n  without  needing  any  restriction 
on  the  parameters. 

Gupta  and  Huang  (1974)  have  proposed  some  procedures  to  select  a  subset 
of  the  given  k  populations  which  is  guaranteed  to  exclude  all  bad  populations 
with  probability  not  less  than  some  preassigned  P*. 
Let $R_{ij} = \frac{1}{2} + \sum_{l=1}^{n} u(|X_{ij}| - |X_{il}|)$, j = 1,2,...,n, i = 1,2,...,k, where
$u(t) = 1$, $\frac{1}{2}$, or 0 as $t >$, $=$, or $< 0$. Thus $R_{ij}$ is the rank of $|X_{ij}|$ among
$|X_{i1}|,\ldots,|X_{in}|$ ($1 \le i \le k$; $1 \le j \le n$). Let $\mathbf{X}_i = (X_{i1},\ldots,X_{in})$. Consider
the one-sample signed rank statistics

$$h(\mathbf{X}_i) = \sum_{j=1}^{n} \mathrm{sgn}(X_{ij})\, E\,J(U_{nR_{ij}}), \qquad (8.1)$$

i = 1,2,...,k, where sgn(t) = 1, 0 or -1 according as t >, =, or < 0;
$U_{n1} \le U_{n2} \le \cdots \le U_{nn}$ are the n ordered random variables from a rectangular
(0,1) distribution, and $J(u) = \Psi^{-1}((1+u)/2)$, where $\Psi(x)$ is the df of a random
variable satisfying $\Psi(x) + \Psi(-x) = 1$ for all real x.

The one-sample H-L estimators are given by

$$\hat\theta_i(\mathbf{X}_i) = \tfrac{1}{2}\,(\hat\theta_{i1} + \hat\theta_{i2}), \qquad (8.2)$$

i = 1,2,...,k, where $\hat\theta_{i1} = \sup\{a: h(\mathbf{X}_i - a\mathbf{1}_n) > 0\}$,
$\hat\theta_{i2} = \inf\{a: h(\mathbf{X}_i - a\mathbf{1}_n) < 0\}$, and $\mathbf{1}_n = (1,\ldots,1)$ is an
n-tuple with all elements 1.

All these statistics and estimators depend on n. The following
property of location invariance (see Hodges and Lehmann (1963)) is
satisfied by these estimators:

$$\hat\theta_i(\mathbf{X}_i + c\mathbf{1}_n) = \hat\theta_i(\mathbf{X}_i) + c, \qquad (8.3)$$

i = 1,2,...,k, c being any constant. In the particular case when J(u) = u
or $\chi_1^{-1}(u)$ (the inverse of a chi-distribution with one degree of freedom) the
statistics become the Wilcoxon signed-rank or normal-score statistics. In
the former case

$$\hat\theta_i(\mathbf{X}_i) = \mathrm{med}_{1 \le j \le j' \le n}\ \frac{X_{ij} + X_{ij'}}{2}, \quad i = 1,2,\ldots,k.$$

Let $\hat\theta_{[1]} \le \hat\theta_{[2]} \le \cdots \le \hat\theta_{[k]}$ denote the ordered estimators and let $\hat\theta_{(i)}$
be the unknown estimator associated with $\theta_{[i]}$ ($1 \le i \le k$).
An Elimination Type Procedure to Select a Subset Excluding All "Strictly Non t
Best" Populations

Let $d(\theta_i, \theta_j)$ be a suitable distance measure between $\theta_i$ and $\theta_j$; the
population $\pi_i$ is "strictly non t best" if $d(\theta_{[k-t+1]}, \theta_i) = \theta_{[k-t+1]} - \theta_i \ge \Delta$,
where $\Delta$ is a given positive constant. Let m denote the unknown number of
"strictly non t best" populations in the given collection of k populations.
Clearly, we have $0 \le m \le k-t$. Let

$$\Omega_m = \{\theta:\ \theta_{[1]} \le \cdots \le \theta_{[m]} \le \theta_{[k-t+1]} - \Delta < \theta_{[m+1]} \le \cdots \le \theta_{[k-t+1]} \le \cdots \le \theta_{[k]}\}.$$

Then

$$\Omega = \bigcup_{m=0}^{k-t} \Omega_m.$$

Let CD stand for a correct decision, which is defined to be the selection
of a subset which excludes all the "strictly non t best" populations.

Gupta and Huang (1974) define the rule R as follows:

R: Reject $\pi_i$ iff

$$\hat\theta_i < \hat\theta_{[k-t+1]} - \Delta + d_1 \quad (0 \le d_1 < \Delta). \qquad (8.4)$$

The constant $d_1$ is chosen to be the smallest number such that
$\inf_{\theta \in \Omega} P_\theta(CD \mid R) \ge P^*$.
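Once the estimates and the constant are available, applying the elimination rule is a single thresholding step. The Python sketch below illustrates this under stated assumptions: the estimates, the values of `delta` and `d1`, and the function name are invented for the example, and `d1` is taken as given rather than computed from the probability requirement.

```python
def eliminate(estimates, t, delta, d1):
    """Split population indices into (retained, rejected) using a cutoff of the form (8.4)."""
    if not 0 <= d1 < delta:
        raise ValueError("d1 must satisfy 0 <= d1 < delta")
    k = len(estimates)
    ordered = sorted(estimates)
    # ordered[k - t] is the t-th largest estimate, i.e. the (k-t+1)-th smallest.
    cutoff = ordered[k - t] - delta + d1
    retained = [i for i, e in enumerate(estimates) if e >= cutoff]
    rejected = [i for i, e in enumerate(estimates) if e < cutoff]
    return retained, rejected

# Five hypothetical H-L estimates; interest is in the t = 2 best populations.
estimates = [0.2, 1.9, 2.4, 0.9, 2.0]
retained, rejected = eliminate(estimates, t=2, delta=1.0, d1=0.3)
print(retained)  # [1, 2, 4]
print(rejected)  # [0, 3]
```

Because `d1` is smaller than `delta`, the cutoff lies below the t-th largest estimate, so the populations with the t largest estimates are never rejected.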

Gupta and Huang (1974) have shown that $P_\theta(CD \mid R)$ is a nonincreasing function
of $\theta_{[i]}$ (i = 1,...,m) and a nondecreasing function of $\theta_{[i]}$ (i = m+1,...,k).

Hence

$$\inf_{\theta \in \Omega} P_\theta(CD \mid R) = \inf_{0 \le m \le k-t}\ \inf_{\theta \in \Omega_m} P_\theta(CD \mid R).$$

It is known that if the $\theta_i$'s are true values of the parameters, then under
some regularity assumptions $\sqrt{n}\,(\hat\theta_i(\mathbf{X}_i) - \theta_i)\,B(F)/A$ tends asymptotically
(as $n \to \infty$) to $Y_i$, distributed as N(0,1), where $A^2 = \int_0^1 J^2(u)\,du$ and
$B(F) = \int_{-\infty}^{\infty} \frac{d}{dx} J(2F(x)-1)\,dF(x)$.
These statistics $Y_i$'s are mutually independent. This leads to a lower bound
on the infimum of the probability of a correct decision for large n as
follows:

.  f  OO 

inf  PQ (CD | R)  >  t—jf  $k-t(x+d/^)$r(x)[l-$(x)]t-r-1c£,(x)dx,  (8.5) 

0vl  U  “  -oo 

where $r = \min(t, k-t-1)$, $d = \sqrt{0.864}\, d_1$ (or $d_1$) for the Wilcoxon (or normal
score) case. For the case $F(x) = \Phi(x)$, using normal scores the inequality
(8.5) is an equality and the result agrees with that obtained by Carroll,
Gupta and Huang (1976).




It  has  also  been  shown  that 

1 im  inf  PQ ( CD | R( n ) ) 
rv«°  0£Q 

=  lim  -rrrj— Try  *  /  *k't<x+  ^jp  d, /n>r(x)[l-*(x)]t_r"1cp(x)dx 

f)-Xo  '  '  ‘  -oo 

»  1,  since  d-j  >  0  , 

so that the sequence of rules $\{R(n)\}$ is consistent with respect to n.

Since each $\hat\theta_i(\mathbf{X}_i)$ is stochastically nondecreasing in $\theta_i$,
it follows that for every $\theta \in \Omega$ and $1 \le i < j \le k$,

$$P_\theta\{R(n) \text{ rejects } \pi_{(i)}\} \ge P_\theta\{R(n) \text{ rejects } \pi_{(j)}\},$$

and thus R(n) is a so-called monotone procedure.